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Abstract 

We present a next-to-leading order (NLO) global DGLAP analysis of nuclear parton 
distribution functions (nPDFs) and their uncertainties. Carrying out an NLO nPDF 
analysis for the first time with three different types of experimental input — deep 
inelastic £+A scattering, Drell-Yan dilepton production in p+A collisions, and inclusive 
pion production in d+Au and p+p collisions at RHIC — we find that these data can well 
be described in a conventional collinear factorization framework. Although the pion 
production has not been traditionally included in the global analyses, we find that the 
shape of the nuclear modification factor i?dAu of the pion p^-spectrum at midrapidity 
retains sensitivity to the gluon distributions, providing evidence for shadowing and 
EMC-effect in the nuclear gluons. We use the Hessian method to quantify the nPDF 
uncertainties which originate from the uncertainties in the data. In this method the 
sensitivity of \ 2 to the variations of the fitting parameters is mapped out to orthogonal 
error sets which provide a user-friendly way to calculate how the nPDF uncertainties 
propagate to any factorizable nuclear cross-section. The obtained NLO and LO nPDFs 
and the corresponding error sets are collected in our new release called EPS09. These 
results should find applications in precision analyses of the signatures and properties 
of QCD matter at the LHC and RHIC. 
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1 Introduction 



In the advent of the LHC becoming online, the knowledge of free and bound nucleon 
parton distribution functions (PDFs) together with their uncertainties, is perhaps more 
topical than ever. Reliable PDF estimates play a significant role in extracting the 
dynamics of the underlying partonic processes from the plethora of detected events in 
a hadron collider like Tevatron or LHC. 

Building on our experience obtained in a series of leading order (LO) analyses 
[U 121 [HH] of the nuclear PDFs (nPDFs), we here extend our work to the next-to- leading 
order (NLO) accuracy in perturbative QCD (pQCD). The earlier studies on this topic 
[HI [6] have already demonstrated that this non-trivial extension is successful, which 
strongly supports the collinear-factorization approach in describing the hard nuclear 
collisions. In addition to confirming this conclusion, our analysis is an extension in two 
important ways: 

First, in contrast to the earlier NLO nPDF analyses that were restricted to include 
only data from deep inelastic lepton-nucleus scattering (DIS) and Drell-Yan (DY) dilep- 
ton production, our analysis contains also inclusive pion production measured at RHIC. 
This additional input improves especially the determination of the gluon densities. 

The second extension concerns the nPDF uncertainty analysis. Although this issue 
has been pursued earlier [71 [3] , it has been very difficult for a general user to actu- 
ally compute the propagation of the nPDF uncertainties to a cross-section of his/her 
interest. The necessary tools for doing such estimates have only been available for the 
groups who did the nPDF analysis and in practice one has hitherto been forced to 
only compare the best-fit predictions given by independent nPDF fits, see e.g. Ref. [8]. 
To improve upon this issue we follow the ideas originally developed by the CTEQ col- 
laboration [91 [10], and construct a collection of nPDF error sets meant to serve as a 
practical way for estimating the nPDF-originating uncertainties in the hard process 
cross-sections in nuclear collisions. The package named EPS09, consisting of our best 
NLO and LO fits together with 30 error sets for both, will be available in [TT) . 

In what follows, we will first introduce the general framework and the method of 
the present analysis in Sec. [2j The obtained results are presented in Sec. [31 which also 
contains comparisons with the data and with earlier global analyses. In Sec. HI we 
present a concrete application of the obtained nPDFs and show how to best estimate 
the PDF-related uncertainties for nuclear cross-sections. In Sec. we give a summary 
and a brief outlook. The technical details of our error analysis are explained in Ap- 
pendix and in Appendix [B] we show how to speed up the numerical calculation for 
pion production. 
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2 Framework and analysis method 



2.1 Definition of nPDFs 

Following the framework of our earlier analyses, we define the bound proton NLO PDFs 
ff(x, Q 2 ) for each parton flavor i by 

tfix, Q 2 ) = R?(x, Q 2 )/f TEQ6 - 1M (x, Q 2 ), (1) 

where Rf{x, Q 2 ) denotes the nuclear modification to the free proton PDF y. CTE Q 61M 



from the CTEQ6.1M set [T2] in the MS scheme. Although more recent sets of free 
proton PDFs exist, this is the latest CTEQ set in the zero- mass variable flavour number 
scheme (ZM-VFNS) which we also use here — a prescription that treats the heavy 
quarks as massless particles, each one active only when the factorization scale Q 2 
exceeds its mass threshold Q 2 > m 2 . We assume that isospin symmetry for bound 
protons and neutrons holds and take the average up and down quark PDF in a nucleus 
A with Z protons to be 

ua(x,Q 2 ) = jf^Q 2 ) + ^^f£(x,Q 2 ) (2) 
d A (x,Q 2 ) = jf^Q^ + ^^f^Q 2 ), 

like in our earlier works [H [2], [31 H] . We neglect the nuclear modifications in Deuterium 
A = 2. Such effects were studied e.g. in Ref. [5] where an order of 1 . . . 2% effects were 
suggested. However, since only a minor improvement in x 2 was found, we expect that 
the error we make by neglecting the nuclear effects in Deuterium should be negligible 
compared to typical uncertainty originating from the experimental errors. 

The parametrization of the nuclear modifications Rf(x,Q 2 ) is performed at the 
charm quark mass threshold Qq = m 2 = 1.69 GeV 2 imposing the momentum and 
baryon number sum rules 
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dxxf l A (x,Q 2 ) = l, / dx[tf v (x,Ql) + fl(x,Q 2 )} =3, (3) 

Jo 

for each nucleus A separately. At higher scales Q 2 > Qq, the nPDFs are obtained 



by solving the DGLAP evolution equations [14] with NLO MS splitting functions 

PSKTHIE?]. 

We solve these equations numerically for both the nuclear and the free proton PDFs, 
following an efficient method suggested in [18]: The PDFs are approximated by 3th 
order polynomials in sufficiently small x-intervals for which the convolution integrals 
involving the splitting kernels can be evaluated analytically. The evolution in Q 2 is 
built as a Taylor expansion around the initial scale Q^. We have tested the accuracy 
of our code against the benchmark PDFs [TJJ], and observed that 9th order Taylor 
expansion with ~ 200 x-points reaches an accuracy well below 1% at Q 2 = 10 4 GeV 2 
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in all x except the largest-x (x > 0.9) sea-quark sector. In practice, for the global fit 
procedure we keep the 9th order Taylor expansion but coarsen the x-grid to speed up the 
computation but ensure that the evolution works properly in the required kinematical 
domain Q 2 < 200 GeV 2 (see Fig. [2] ahead). 

The relevant parameters that specify the evolution are the charm and bottom quark 
masses m c = 1.3 GeV and m& = 4.5 GeV, and the 2- loop strong coupling constant fixed 
at Z-pole to a s (M|) = 0.118. The top quark is not considered in the present analysis. 
These choices coincide with the ones adopted in CTEQ6.1M. In the LO analysis we 
take the free proton baseline from the CTEQ6L1 [13] set, and solve the LO DGLAP 
equations with 1-loop a s . 

2.2 Fitting functions and parameters 

The diversity of the presently available data is not broad enough to allow the nuclear 
modifications for each parton flavor to be independently determined. Thus, following 
our previous works [U [21 El S] we define only three different nuclear corrections at the 
initial scale Qq\ R v for both valence quark distributions, R$ for all sea quarks, and 
Rq for gluons. We parametrize these by piecewize functions 



where the parameters <2j, 6j, q, (3, x a and x e are A-dependent. Joining the three pieces 
together to give a continuous function with vanishing first derivatives at matching 
points x a and x e , eliminates 6 out of the 13 parameters. The remaining ones are ex- 
pressed in terms of the following 6 parameters with obvious interpretations: 



the remaining parameter Co is fixed to cq = 2y e . The roles of these parameters are 
illustrated in Fig. [1] which also roughly indicates which x-regions are meant by the 
commonly used terms: shadowing, antishadowing, EMC-effect, and Fermi-motion. 
The A-dependence of the fit parameters is assumed to follow a power law 



where di = x a , y a . . ., and where the reference nucleus is Carbon, A re { = 12. 

The baryon number and momentum sum rules eliminate yo and p yo for valence 
quarks and gluons, leaving us with 32 free parameters. This is still way too large 
number of parameters to be determined only by the data — further assumptions (based 
on prior experience) are needed to decide which parameters can truly be deduced from 
the data and which can be taken as fixed. 




(4) 



yo 

Xa) Ha 
Xei He 



Height to which shadowing levels as x — > 
Position and height of the antishadowing maximum 
Position and height of the EMC minimum 
Slope factor in the Fermi-motion part, 




(5) 
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Fi gure 1: An illustration of the fit function Rf'(x) and the role of the parameters x a , x e , y®, 
y a , and y e . 

2.3 Experimental input and cross-sections 

The main body of the data in our analysis consists of i + A DIS measurements. We 
also utilize the DY dilepton production data from fixed target p+A collisions at Fermi- 
lab and inclusive neutral-pion production data measured in d+Au and p+p collisions 
at RHIC0. Table [1] lists the sets included in our analysis and Fig. [2] displays their 
kinematical reach in the (x, Q 2 )-plane. We will use the following notation: 

\da l £ s /dQ 2 dx _ F 2 A (x, Q 2 ) 

\do$JdQ*dx ' ^ } ~ F*(x, Q 2 ) 

\da$/dM 2 dx 1>2 

\da^ Y I dM 2 dx\ t 2 

1 d 2 N^/dp T dy min= bias ±d 2 a« Au /dp T dy 
(JVoou) d 2 N™/dp T dy d 2 a7/dp T dy ' 

The kinematical variables in DIS are the Bjorken-x and the virtuality of the photon Q 2 . 
In DY M 2 denotes the invariant mass of the lepton pair, and a?x,2 = \/ M 2 j s e ±v where 
y is the pair rapidity. The inclusive pion production is characterized by the transverse 
momentum px and rapidity y of the outgoing pion. The average number of binary 
nucleon-nucleon collisions (in the centrality class studied) is denoted by (N co \\). In 
this analysis we only consider minimum bias data, and do not focus on the transverse 
coordinate dependence of the nPDFs. The kinematical cuts we impose on the data are 
M 2 , Q 2 > 1.69 GeV 2 for DIS and DY, and px > 1.7 GeV for inclusive pion production. 
All cross-sections are calculated in the collinear factorization formalism folding the 

1 In contrast to our previous analysis [4], we do not include the BRAHMS forward rapidity charged 
hadron d+Au data here. These data will be separately discussed in Sec. H) 



R£ Y (X 1>2 ,M 2 ) EE 
^MAu — 
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Experiment 


Process 


Nuclei 


Data points 


X 2 LO 


X 2 NLO 


Weight 


Rcf. 


SLAC E-139 


DIS 


Hc(4)/D 


21 


6.5 


7.3 


1 


m 


NMC 95, re. 


DIS 


Hc/D 


16 


14.5 


15.6 


5 




NMC 95 


DIS 


Li(6)/D 


15 


23.6 


16.8 


1 


m 


NMC 95, Q 2 dep. 


DIS 


Li(6)/D 


153 


162.2 


157.0 


1 


M 


SLAC E-139 


DIS 


Be(9)/D 


20 


9.6 


12.2 


1 


m 


NMC 96 


DIS 


Be(9)/C 


15 


3.8 


3.8 


1 


m 


SLAC E-139 


DIS 


C(12)/D 


7 


4.1 


3.2 


1 


[H 


NMC 95 


DIS 


C/D 


15 


15.0 


13.8 


1 


22 


NMC 95, Q 2 dep. 


DIS 


C/D 


165 


141.8 


142.0 


1 


[22] 


NMC 95, re. 


DIS 


C/D 


16 


19.3 


20.5 


1 


[21] 

i ! 


NMC 95, re. 


DIS 


C/Li 


20 


30.3 


28.4 


1 


EU 


FNAL-E772 


DY 


C/D 


9 


7.5 


8.3 


1 


[23 


SLAC E-139 


DIS 


Al(27)/D 


20 


10.9 


12.5 


1 


m 


NMC 96 


DIS 


Al/C 


15 


6.0 


5.8 


1 


ESI 


SLAC E-139 


DIS 


Ca(40)/D 


7 


5.0 


4.1 


1 


m 


FNAL-E772 


DY 


Ca/D 


9 


2.9 


3.4 


15 




NMC 95, re. 


DIS 


Ca/D 


15 


25.4 


24.7 


1 




NMC 95, re. 


DIS 


Ca/Li 


20 


23.9 


19.6 


1 


E5 


NMC 96 


DIS 


Ca/C 


15 


6.0 


6.0 


1 


231 


SLAC E-139 


DIS 


Fe(56)/D 


26 


19.1 


23.9 


1 


[20] 


FNAL-E772 


DY 


Fc/D 


9 


2.1 


2.2 


15 


[24] 


NMC 96 


DIS 


Fe/C 


15 


11.0 


10.8 


1 


E3I 


FNAL-E866 


DY 


Fc/Bc 


28 


20.9 


21.7 


1 


m 


CERN EMC 


DIS 


Cu(64)/D 


19 


13.4 


14.8 


1 


m 


riT A y i in 1 t~% t~\ 

SLAC E-139 


DIS 


Ag(l08)/D 


7 


3.8 


2.9 


1 




NMC 96 


DIS 


Sn(ll7)/C 


15 


9.6 


9.1 


1 


m 


NMC 96, Q z dep. 


DIS 


Sn/C 


144 


80.2 


82.8 


10 


[27] 














(x=0. 0125 only) 




FNAL-E772 


DY 


W(l84)/D 


9 


7.0 


6.7 


10 


[23 


FNAL-E866 


DY 


W/Bc 


28 


27.3 


24.2 


1 


m 


SLAC E-139 


DIS 


Au(l97)/D 


21 


11.6 


13.8 


1 


m 


RHIC-PHENIX 


7T° prod. 


d Au / pp 


20 


7.3 


6.3 


20 


m 


NMC 96 


DIS 


Pb/C 


15 


6.90 


7.2 


1 


m 


Total 






929 


738.6 


731.3 







Table 1: The data used in our analysis. The mass numbers are indicated in parentheses 
and the number of data points refers to those falling within our kinematical cuts, Q 2 , M 2 > 
1.69 GeV 2 for DIS and DY, and px > 1.7 GeV for hadron production at RHIC. The quoted 
X 2 values correspond to the unweighted contributions of each data set in LO and NLO. Also 
the weight factors for data sets are shown. 



PDFs fi(x,fi 2 ) with perturbatively calculable pieces a, schematically 

E /^ 2 )®4tr^V) 

i=q,q,g 

E ft^ 2 )®ft^ 2 )®^ l+l ~ +x ^ 2 ) (7) 

i,j,k=q,q,g 

As indicated above, we choose the factorization, renormalization and fragmentation 
scales to be equal, fixed to a characteristic scale in the process: in DIS to the photon 
virtuality Q 2 , in DY to the invariant mass of the lepton pair M 2 , and in the pion 
production to its transverse momentum p\. 

The expressions for double differential DIS cross-sections d 2 a/dxdQ 2 involving the 
NLO coefficients for structure functions Fip are well documented e.g. in Ref. [29]. In 
the case of DY we need the rapidity distributions d 2 a/dM 2 dyn, from which d 2 a jdM 2 dx\^. 
are obtained by the relation = \J M 2 /s e ±y . The relevant partonic cross-sections 
for DY process can be obtained from [30] . 

The numerical calculation of NLO DIS cross-sections involve only one-dimensional 
integrals, posing no problems in evaluating them repeatedly during the fitting process. 
The calculation of the DY cross-section, however, requires evaluation of double integrals 
which are already somewhat slower to compute. The small amount of available DY 
data nevertheless keeps the direct evaluation of the integrals still fast enough, and we 
do not need to resort to use pre-computed K-factors as is done e.g in [5"T] . 

The inclusive pion production embodies the fragmentation functions /^^(z, /x 2 ) 
for parton flavour % to make a pion carrying a fraction z of its momentum. In this 
work we have decided to use the KKP average pion n + + 7r~ parametrization [31] for 
describing the neutral pion 7r° production. We have, however, checked that the results 
obtained with newer ones AKK08 [32] and nDSS [331 El] practically all coincide in the 
case of average pions. In practice, we employ the INCNLO [35] program which includes 
the NLO hadron production code by Aversa et al. [36J. The computation of pion 
production in NLO involves triple integrals which would be very time consuming to 
evaluate directly at every round of iteration, but we can speed up the computation 
significantly by applying the procedure explained in Appendix [B] 

The role played by each data type can be summarized as follows: 

• Deep inelastic scattering 

DIS data has an excellent constraining power for quark distributions in the whole 
range 0.01 < x < 1 spanned by the data. At large x these data are mainly 
sensitive to the valence quarks and at small x to the sea quarks. At moderate 
x, however, close to the antishadowing peak near x = 0.1, such separation of the 
sea and valence quark contributions is not possible on the basis of this type of 
data alone. Despite the direct photon-gluon fusion channel contributing to the 



a 



DIS 



a 



P +A^i+r+x 

DY 



a 



A+B^tt+X 
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Figure 2: The kinematical reach of the DIS, DY and pion production data (see Table H]) 
corresponding to the factorization scale choices explained in the text. The points indicate 
the lowest x and Q 2 values in which partons are sampled in the cross-section calculation. 
Also the BRAHMS data [37] for negatively charged hadron production is shown as it will be 
discussed later in Sec. [U The dashed horizontal line indicates the kinematical cut imposed 
on the data. 

DIS cross-section at NLO, the main gluon constraint provided by DIS still comes 
through the scale evolution of sea quarks that is driven by the gluons. 

• Drell-Yan dilepton production 

The DY data, taken together with DIS, can discriminate between valence and 
sea quarks near x = 0.1. The DY cross-section retains also some sensitivity to 
the sea quarks at larger x but, unfortunately, the precision of the current data 
is not enough to exploit this constraint in its full potential. The invariant mass 
M 2 in our data sample is typically large, M 2 S> Ql, and consequently there are 
sizable evolution effects that constrain the gluons also. 

• Inclusive pion production 

This type of data is usually accompanied by a rather large normalization un- 
certainty stemming, among other sources, from the mo del- dependent quantity 
(N co n). Apart from the normalization uncertainty, the shape of R^au can never- 
theless act as a vital constraint, especially for the nuclear modification for gluons. 
The slight downward trend seen in the large-py part of -RJau ^ midrapidity [2H] 
indicates the need for a gluon EMC-effect, and the smallest-pr part of the -RJau 
would not be satisfactorily reproduced without shadowing (see Fig. [9] ahead). 
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2.4 Definition of the best fit 



A customary way of finding an optimal set of parameters that fits a large number of 
experimental data, is to minimize a global x 2 -function with the help of a routine such 
as MINUIT [38]. The requirement for a x 2 -function is that its magnitude measures the 
degree of agreement between the theory predictions and the experimental data, but 
the exact definition can vary from one analysis to another. In this work we define the 



where N labels the data sets, Di denotes an individual data value with <jj point-to-point 
uncertainty (statistical and systematic uncertainties added in quadrature), and T, is 
the corresponding theory prediction with a parameter set {a}. If an overall (relative) 
normalization uncertainty a™ rm is given, the normalization factor G [1 — cr™ rm , 1 + 
0-normj j g introduced, and determined for each parameter set {a} by minimizing x%- 
Thus, is computed in every round of iteration, when searching for the best overall 
X 2 - The final fx is thus an output of the analysis. 

The overall normalization uncertainty is the simplest example of correlated uncer- 
tainties in the data. If the experiment is able to provide a correlation matrix accounting 
for more complex correlations, the definition of the \ 2 should be extended further, as 
done e.g. in [9J. However, the definition ([§]) is adequate for our purposes, since such 
correlation matrices are presently not available for the nuclear data we use. 

The weight factors (with default value = 1) are needed to amplify con- 
straints provided by those data sets whose contribution to \ 2 would otherwize be too 
small to be noticed by MINUIT. As the weights apparently induce a piece of subjectiv- 
ity to the analysis, some further explanation here is in order: Weighting is justified if 
there is no significant mutual tension between the weighted data set and the others. 
In this case the weights do not introduce a strong bias in the analysis (such bias was 
found e.g. in [1]). For example, the 7r°-data makes only about 1% of the total x 2 , 
and changes in such a small fraction are practically buried in the numerical noise due 
to statistical fluctuations in the wealth of other data. Thus, inclusion of these data 
with a default weight of one would be effectively the same as leaving it completely out 
from the analysis, which would mean loosing all the large-x gluon constraints it can 
provide. Consequently, the gluon EMC-parameter y e would need to be fixed by hand 
which, we think, would induce clearly more subjectivity to the fit than assigning these 
data some extra weight and keeping y e free. Similarly, weighting the DY data enhances 
its residual sensitivity to the large-x sea quarks, which enables us to free the large- £ 
sea parameter y e in a controlled manner. We see these as clear improvements over the 
previous nPDF analyses. 



global x 2 



by pm 





N 




(9) 



S 



In practice, however, finding a reasonable set of weights is largely an iterative 
procedural: having first found a converging fit with a limited amount of free parameters 
and no additional weights, it is often easy to notice whether there are features in 
the data sets not satisfactorily reproduced. These data sets are then assigned some 
extra weight and, if needed, some previously freezed parameter can be freed and the 
minimization procedure repeated. In this way we have arrived at the weights listed in 
Table ffl 

The gluons at small x are only weakly constrained by the available data, and we 
have observed that some parameters easily drift — especially in the error analysis - 
to an unphysical region where shadowing would get stronger towards smaller nuclei. 
We cure this unwanted feature by adding a further term 

1000 [(y G (He) - y G (Pb)) - (y s (Re) - y s (Pb))} 2 (10) 

to the x 2 , which effectively drives the A-dependence, but not the magnitude, of Rq at 
the smallest-re similar to what there is in the sea quarks. We stress that inclusion of 
this term to x 2 does not affect the agreement with the data in any way, but it secures 
a systematic deepening of the gluon shadowing when going from lighter nucleus to a 
heavier one, especially so for the error-sets discussed below. 

2.5 nPDF Errors 

As mentioned earlier, one of the main goals of the present work is to develop a general- 
use tool for estimating the propagation of the nPDF uncertainties to any hard process 
quantity X. This brings the uncertainty analysis of nPDFs up to a similar level as in 
the free proton case. 

In this paper we will only consider uncertainties originating from the experimental 
errors in the fitted data. Although the set of parameters {a } giving the minimum \ 2 
represents our best estimate for the nPDFs, the presence of experimental uncertainties 
allows us to move in the neighborhood of {a } without immediately deteriorating the 
agreement with individual data sets. To find this relevant neighborhood and to quantify 
the relative differences to our central nPDFs properly, we adopt the Hessian method_|. 
The details of this this method are documented in Appendix |A], we summarize here 
only the main points. 

The Hessian method rests on a quadratic approximation for a departure of x 2 from 
its minimum Xo 

ij 1 3 ij 

which defines the Hessian matrix H. In constructing the expansion (TTTT) . special at- 
tention should be paid to multi-parameter correlations. If several parameters control 

2 For a weight determination in a neural network approach, see [39] . 
3 For alternative methods see e.g. Refs. [9l I40| . 



9 



the same parton type in a certain x-region, there are usually notable correlations be- 
tween them. This may lead to over-estimating the uncertainty in that region when 
applying the approximation in Eq. (II ip . In particular, changes in the valence quark 
modification Ry obtained by small variations of the Fermi-motion parameter (3 can be 
largely realized by varying x e and y e instead. In this sense (3 is redundant in the error 
analysis and we eliminate it from the expansion ffTTj) . The same argument applies to 
A-dependence of these parameters. The 15 final parameters considered in our error 
analysis are indicated in Table [2j 

A practical and numerically more stable way is to work not with the original pa- 
rameters {a}, but use a basis {z} which diagonalizes the Hessian matrix. Assuming 
linear propagation of uncertainties, the symmetric extreme deviation in any quantity 
X that corresponds to a given rise A% 2 in x 2 , can be written as 

( AX ) 2 ^iE( x (^ + )- x ^")) 2 ' ( 12 ) 

k 

where X(S£) and X(S^) denote the values of the quantity X computed by the nPDF 
sets S£ and , which are obtained by displacing the fit parameters in the z-space 
direction k such that x 2 grows by a certain fixed amount A% 2 . Lower and upper 
uncertainties IS.X ± can be computed also separately, using a prescription [JT] 

( A * + ) 2 « J2[ max { X ( S k)- X ( S % X ( S k)-^(S°),0}] 2 (13) 

k 

( A ^~) 2 « ^[max{A(5 )-X(^ + ),X(5 )-X(^),0}] 2 , 

k 

where S° denotes the central PDF set giving the minimum x 2 • All the errors shown in 
this paper have been computed by the latter method, with A% 2 = 50 which corresponds 
to a "90% confidence criterion" defined in Appendix [A] As discussed in Sec. [3] the 
Hessian matrix is computed for 15 fitting parameters in our analysis. According to 
the procedure explained above, this leads to 30 error-sets in addition to the best-fit 
central set. All these 31 nPDF sets needed for the evaluation of Eq. (1T3T) are included 
in our EPS09 release [IT]. This allows the computation of uncertainties in nuclear cross- 
section ratios, similar to those in Eq. ([?]), by any interested user. The corresponding 
calculation of uncertainty estimates for the absolute cross-sections (instead of ratios), 
will be explained later in Sec. |3J 

In this paper we do not discuss the possible shortcomings concerning our whole 
framework such as breakdown of pQCD factorization at small Q 2 or small/large- x 
corrections to the DGLAP splitting functions already at the nucleon-nucleon level 
collisions — for an extensive study of such issues, we recommend the reader to consult 
Refs. [321 EE]- Also, the presence of additional {Q 2 )~ n power corrections are expected 
to become increasingly important for large nuclei, see e.g. [HI 1351 I3SII3TI 135] . As we will 
see next, however, the present framework seems to work very well in the kinematical 
domain considered in this analysis. 
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3 Results and Discussion 
3.1 Final parametrization 



NLO 


Param. 


Valence R v 


Sea R§ 


Gluon Rq 


1 




Baryon sum rule 


0.785 


Momentum sum rule 


2 


Pyo 




-0.136 




3 




6.56 xlO 2 


8.74 xlO 2 


0.1 fixed 


4 


Px a 


0, fixed 


0, fixed 


0, fixed 


5 


x e 


0.688 


as valence 


as valence 


6 




0, fixed 


0, fixed 


0, fixed 


7 


Va 


1.05 


0.970 


1.207 


8 


Py a 


1.47 xlO 2 


-8.12 xlO 3 


2.72 xlO 2 


9 


Ve 


0.901 


1.076 


0.625 


10 


Pye 


-2.81 xlO 2 


as valence 


as valence 


11 


p 


1.31 


1.3, fixed 


1.3, fixed 


12 


Pp 


4.63 xlO" 2 


0, fixed 


0, fixed 


LO 






1 


Vn 


Baryon sum rule 


0.890 


Momentum sum rule 


2 


Pyo 




-8.03 xlO 2 




3 




6.84 xlO 2 


0.127 


0.1 fixed 


4 


Px a 


0, fixed 


0, fixed 


0, fixed 


5 


x e 


0.727 


as valence 


as valence 


6 


Px e 


0, fixed 


0, fixed 


0, fixed 


7 


Va 


1.04 


0.971 


1.096 


8 


Py a 


1.53 xlO 2 


-1.65 xlO 2 


4.57 xlO 2 


9 


Ve 


0.902 


1.062 


0.901 


10 


Pye 


-2.97 xlO 2 


as valence 


as valence 


11 





1.34 


1.3, fixed 


1.3, fixed 


12 


Pp 


5.07 xlO" 2 


0, fixed 


0, fixed 



Table 2: List of parameters defining the modifications Ry, Rg and Rq through Eqs. dU) 
and ([5]) at our initial scale Qq = 1.69 GeV 2 in NLO and LO. The final values for the 15 free 
parameters considered in the minimization procedure and in our error analysis are shown in 
bold face. The values for the parameters j3 and pp for valence quarks were fixed on the basis 
of a 17 parameter fit. 

In Table |2] we present the values of the 17 parameters resulting from the x 2 min- 
imization and those that we fixed. The error analysis was performed for the 15 pa- 
rameters shown in bold face. The corresponding nuclear modifications together with 
the uncertainty estimates at our initial scale Q$ = 1.69 GeV 2 and at a higher scale 
Q 2 = 100 GeV 2 are shown for Carbon and Lead in Fig. [3J We make the following 
observations for gluons, sea quarks and valence quarks respectively. 
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Figure 3: The nuclear modifications Ry, Rs, Rg for Carbon (upper group of panels) and 
Lead (lower group of panels) at our initial scale Ql = 1.69 GeV 2 and at Q 2 = 100 GeV 2 . 
The thick black lines indicate the best-fit results, whereas the dotted green curves denote the 
error sets. The shaded bands are computed from Eq. (fT5j) . 

At our parametrization scale Ql there are large uncertainties in both small-a; and 
large-x gluons. Only at moderate x the gluons are somewhat better controlled as the 
precision small-x DIS data — although directly more sensitive to the sea quarks - 
constrain the gluons at slightly higher x due to the parton branching encoded into 
DGLAP evolution. At higher Q 2 the small-x uncertainty rapidly shrinks whereas at 
large x a sizable uncertainty band persists. 
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The narrow throat in the sea quark uncertainty band at x ~ 10~ 2 . . . 10" 1 reflects 
the good constraining power of the precision DIS data. Towards higher x, the uncer- 
tainty grows as the accuracy of the DY data is not enough to decisively nail down 
nuclear modification for the sea quarks there. Note, however, that unlike in our earlier 
works, the parameter y e was free. Towards small x the errors are perhaps surprisingly 
small given that there are no direct data constraints. This is an artefact of the chosen 
form of the fit function as the tight constraints at x ~ 10 -2 . . . 10 _1 fix also the smaller- 
x behaviour leading to an unrealistically small uncertainty band there. This should be 
kept in mind when computing hard cross-sections involving very-small-x gluons. 

The valence quark modifications are well under control at x > 0.1 — thanks to the 
DIS data probing the valence quarks at this region. At small-rc, the baryon number 
sum rule together with the chosen form of the fit function halts the uncertainty band 
from growing, although there are no data that would directly constrain the valence 
quarks there. 



3.2 Comparison with the data 




Figure 4: The computed NLO R^ Y i x : M 2 ) (filled squares and error bands) as a function of 
X2 (left) and x\ (right) compared with the E772 [24] and E866 [25] data (open squares). 

In this section we demonstrate how the obtained NLO parametrization reproduces 
the data and how the nPDF uncertainties in Fig. prelate to the error bars of the data. 
In all figures we present, the open symbols with error bars denote the experimental 
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data and the filled symbols or thicker lines are the corresponding calculated values. 
The shaded light blue bands indicate the uncertainty estimates computed using our 
30 error sets in Eq. (1T3"]) . Unless otherwize stated, the point-to-point systematic and 
statistical errors have been added in quadrature. 

In Fig. H] we show the Drell-Yan data. As mentioned earlier, we have found that 
there is some dependence on the large-x sea quarks in the E772 data shown in the 
left panels. However, the apparent large scatter from one nucleus to another makes 
it difficult to well exploit this sensitivity and completely impossible to extract the A- 
dependence of the large- x Rs independently. The growing uncertainty band towards 
larger x is nicely in line with the growing large-x uncertainty for Rs in Fig. [31 We would 
like to emphasize two systematic features present in the data, which are well caught 
by our global analysis: The downward trend (shadowing) of the smallest-x points in 
the E772 data, and the diminishing nuclear effects towards larger invariant mass M in 
the E886 data. 

The DIS data in Figs. [5HH1 comprise the bulk of our experimental constraints. Since 
a large part of the DIS data points lies at relatively low Q 2 (see Fig. [2]), the good 
agreement between the data and the theory also verifies that our fit function is flex- 
ible enough. At higher Q 2 , the evolution effects become sizable and the fit function 
dependence is no longer as critical. 

The A-dependence of the nuclear effects is encoded in the parametrization through 
Eq. ([5]), and its determination is mainly possible due to the existence of DIS data for 
several nuclei. This dependence and its correct reproduction is obvious e.g. from Figs. [5] 



11 




I.I 

1.0 
0.9 
0.8 
1.2 

1.0 

3, 0.8 

I.I 
1.0 

0.9 
0.8 
1.2 
I.I 
1.0 
0.9 
0.8 



_ 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
□ SLAC J - 

: A=4 


_ 1 i 1 i 1 i 1 i 1 i 1 i 1 i 1 i 1 i 1 _ 

- ■ EPS09NLO i : 

- A=9 


: "N, : 

: A- 12 ^ 


- A=27 


: \. : 

; A=40 


l A=56 


: A=108 

-,1,1,1,1,1, 


- A=197 J i j : 



0.5 



1.3 
1.2 
1.1 
1.0 
0.9 
0.8 
1.2 
1.1 
1.0 
0.9 
0.8 
1.2 
1.1 
1.0 
0.9 
0.8 
1.2 
1.1 
1.0 
0.9 
0.8 



Figure 6: The computed NLO ratios Rp 2 (x,Q 2 ) vs. Rp 2 (x,Q 2 ) and R^ ls (x,Q 2 ) (filled 
squares and error bands) compared with the NMC [23] (left) and SLAC [20] (right) data 
(open squares). 



and [6] in which data from several nuclei are gathered together for an easy comparison. 

In addition to the DY data, the scaling violation effects are clearly visible in the 
NMC data Figs. [7] and [U which display the Q 2 -evolution of the DIS structure function 
F 2 ratios. Thus, the DGLAP evolution plays a significant role in a successful description 
of these data. 

In Fig. [9] we finally show the nuclear modification for inclusive pion yield in d+Au 
collisions at midrapidity. We plot both PHENIX vr° [2H] and STAR vr + + 7r- [50] datcfl 
We are, at the moment, the only group that includes this type of data in the global 
DGLAP analysis of nPDFs. We observe, however, that the shape of the pionic -RdAu 
as a function of px is very well reproduced without any additional effects, indicating 
that the universality of the nPDFs works well. We thus find no reason to exclude these 
data from the global fit analyses. 

Typically, the error bands in most of the figures are roughly of the same size as 
(especially never badly overshooting) the error bars in the data, verifying that we have 
successfully propagated them to our nPDFs. In some cases, however, the size of the 
nPDF errors is apparently below the experimental uncertainties. This can happen for 



4 The STAR data was, however, not included in the fit due to poorer statistics and more restricted 
kincmatical reach. 



15 



0" 
o 

CO 

EC 



1.1 

1.05 

[.0 
(1.05 

0.9 
0.85 

0.8 

1.1 
1.05 

1.0 
0.95 

0.9 
0.85 

0.8 

1.1 
1 .05 

1.0 
0.95 

0.9 
0.85 

0.8 

1.1 
1.05 

1.0 
0.95 

0.9 
0.85 

0.8 

1.1 
1 .05 

1.0 
0.95 

0.9 
0.85 

0.8 
0.75 




1_ x=0.0175 



"Sfifl 



1 



10 




~- x=0.025 



. jj^i^J 3 --■ 



x=0.055 

i 1 1 m ill — 




n i;=0.125 

i i i m ill — 



^ i i i i 1 1 1 1 i i i i irrn: i i i i 1 1 1 1 1 i i i i 1 1 1 1 1 

m k 



1=0.55 

I I 

100 1 10 100 1 

Q 2 [GeV 2 ] 



10 100 



1.1 

1.05 

1.0 

0.95 

0.9 

0.85 

0.8 

1.1 

1.05 

1.0 

0.95 

0.9 

0.85 

0.8 

1.1 

1.05 

1.0 

0.95 

0.9 

0.85 

0.8 

I.I 

1.05 

1.0 

0.95 

0.9 

0.85 

0.8 

1.1 

1.05 

1.0 

0.95 

0.9 

0.85 

0.8 

0.75 



Figure 7: The calculated NLO scale evolution (solid black lines and error bands) of the ratio 
F2/F2 compared with the NMC data [27] for several fixed values of x. 

two reasons: First, in cases like Fig. [8] the small error bands result simply because 
there are other, more precise data sets probing the same x-region and, by construction, 
the size of our error bands are controlled by the most stringent ones. Second, in cases 
like the panels of Fig. O which compare Beryllium or Aluminium to Carbon, where 
cross-section ratio is taken between two nuclei very close in A, the errors in other 
parameters than those from the powers in Eq. ([5]) tend to cancel. 

3.3 Comparison with earlier global analyses 

In Fig. [10] we compare the results obtained in the present analysis EPS09 to the earlier 
NLO global analyses nDS [6] and HKN07 [5]. The comparison is shown at two scales, 
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Figure 8: The calculated NLO scale evolution (solid black lines and error bands) of the ratios 
F^/F® and F^/F® compared with the NMC data [27] for several fixed values of x. One 
percent systematic error has been added in quadrature with the statistical errors. 



Q 2 = 1.69 GeV 2 and Q 2 = 100 GeV 2 , overlaid on our error bands. Evidently, the main 
differences in the central fits are in the gluon modifications Rq, especially at small x 
and low Q 2 , but going to higher Q 2 the curves tend to gather closer together. The 
exception is the large-x region where significant differences persist. This comparison 
illustrates the large leftover freedom especially in nuclear gluon sector if only DIS and 
DY data are used as constraints. 

The majority of the discrepancies in Fig. [TU] originate from the adopted forms of 
the fit functions and assumptions made about the parameters in those x-regions which 
are not well constrained by the DIS or DY data. In particular, HKN07 and nDS 
did not include pion d+Au data which favors the presence of an EMC suppression 
for gluons. In order to demonstrate the importance of these data in constraining the 
large-x gluons, Fig. [TT] displays the pion -RdAu including also the prediction obtained 
by using the HKN07 NLO nPDFs. Although the HKN07 prediction remains largely 
within the experimental errors, the shape of the EPS09 curve is clearly more consistent 
with the data. The dominant partonic channel for inclusive pion production is this px 
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Figure 9: The computed i?dAu (thick black line and blue error band) at y = for inclusive 
pion production compared with the PHENIX [28] data (open squares). The error bars are the 
statistical uncertainties, and the yellow band indicate the point-to-point systematic errors. 
The additional 10% overall normalization uncertainty in the data is not shown. The data 
have been multiplied by the optimized normalization factor /jv = 1.03, which is an output 
of our analysis. Also the STAR data [50] (open circles) multiplied by a normalization factor 
/at = 0.90 are shown for comparison. 




Figure 10: Comparison of the average valence and sea quark, and gluon modifications at 
Q 2 = 1.69 GeV 2 and Q 2 = 100 GeV 2 for Pb nucleus from the NLO global DGLAP analyses 
HKN07 [5], nDS jfj] and this work, EPS09NL0 . 



18 



range the gluon-intiated one. Thus, the qualitatively contradicting behaviour between 
HKN07 and EPS09 in Fig. [TT] is inevitably linked to the differences in their nuclear 
effects for gluon PDFs. 




7 I I i I i I i I i I i I i I i L 

2 4 6 8 10 12 14 16 



p T [GeV] 

Figure 11: As Fig. [9] but also the prediction from HKN07 (NLO) is shown. The difference 
between EPS09 and HKN07 here demonstrates the constraining power of these data in pinning 
down the nuclear gluon PDFs in the mid-x and large- £ regions. 



3.4 Leading-order analysis 

Although the NLO analysis is the main objective in the present paper, we have also 
performed a new LO analysis to provide the tools for computing uncertainty estimates 
also in this widely-used framework. The LO framework is basically the same as in 
NLO, but the partonic cross-sections and DGLAP splitting functions are one power 
lower in a s , and we use the CTEQ6L1 [13] free proton PDF set as the baseline. The 
parameters specifying the LO fit are listed in Table [2j 

Due to the qualitatively similar behaviour of the LO and NLO fits, we do not 
present a detailed discussion of the LO results but only show, in Fig. [12], the resulting 
nuclear modifications for Lead at the initial scale Qq with their Ax 2 = 50 uncertainty 
bands. The same figure also presents the comparison to other LO analyses. We find 
that our error bands can accommodate most of the earlier analyses. So, in practice, 
using the EPS09L0 error sets all these different nPDF parametrizations are effectively 
covered. 

As can be read off from Table [1] there is practically no difference between the LO 
and NLO fits at the level of the values of x 2 obtained. Intuitively, however, the NLO 
fit should be better constrained than the LO one as there are more partonic channels 
open in the NLO DIS and DY cross-sections. Indeed this expectation is confirmed by 
Figs. [3] and [12] Especially the large- £ gluons are better constrained at NLO. 
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Figure 12: Comparison of the average valence and sea quark, and gluon modifications at 
Q 2 = 1.69 GeV 2 for Pb nucleus from LO global DGLAP analyses EKS98 [HE], EKPS [3], 
nDS [6], HKN07 [5], and this work EPS09L0. 

4 Application 

In this section we apply the obtained EPS09NL0 parametrization — the central set 
and 30 error sets — to a cross-section that was not included in the fit. Through this 
example we also want to demonstrate how our parametrization should be applied in 
practice. 

We consider here inclusive negative hadron h~ production at forward (pseudo) ra- 
pidities rj = 2.2 and rj = 3.2, in p+p and d+Au collisions, measured by the BRAHMS 
collaboration [37] at RHIC. In our previous article [I] we discussed how the suppres- 
sion observed in the nuclear modification i?dAu obtained from these data would strongly 
favour very deep gluon shadowing, and we searched for the strongest possible one that 
would still not contradict the available DIS and DY data. The analysis [I] was per- 
formed in a LO framework and we were forced to use fragmentation functions for 
average hadrons h + + h~ instead of charge-separated ones for h~ onlj@. In the cur- 
rent NLO setup we relax such simplification and employ the charge-separated NLO 
fragmentation functions by Sassot et al. |33j. 

We first investigate how well the NLO pQCD calculation can reproduce the shape 
and magnitude of the differential h~ yields measured by BRAHMS in p+p and d+Au 
collisions from which the nuclear modification -RdAu is computed. The inclusive yields 
are linked to the cross-sections by 

d 2 N™ min=bias 1 d 2 a™ d 2 N dA * min=bias (N coll ) ^d 2 a dAu 

dp T dy aj^ astic dp T dy ' dp T dy a^ astic dp T dy ' ^ ' 

where <r^ astlc is the total inelastic nucleon-nucleon cross-section and (N co n) is the 

5 The extraction of the charge separated fragmentation functions from p+p data is reliable only at 
NLO due to significant perturbative O (ai?) corrections. 
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average number of binary nucleon-nucleon collisions. Here we take 0"^ astlc = 40 mb 
[4T?] . and (iV co n) = 7.2 [37J, understanding that there is an inherent uncertainty in 
both: 

^inelastic ^ as no t been measured for y's — 200 GeV and determining its value 
requires extrapolation. In turn, (N co \\) is a mo del- dependent number resulting from 
a Glauber-model simulation performed by the experiment. However, both of these 
quantities are only multiplicative overall factors and do not affect the shape of the 
obtained distribution. 

Whenever comparing theoretical predictions to experimental data, we emphasize 
the importance of presenting also the theoretical uncertainties deriving from the PDF 
errors. For absolute cross-sections in p+p collisions these are obtained from Eq. ([13]) 
using the 40 error sets provided by CTEQ [12] . In the d+Au case there is an additional 
contribution from the uncertainties of the nuclear effects which we have quantified in 
the present analysis by the 30 error sets with the CTEQ central set as a baseline. In 
this case, we compute the d+Au cross-sections separately with 1+40+30=71 parton 
sets defined by: 



H s o 


X 


jCTEQ6.1M 
J Central set 


n s° 


X 


/•CTEQ6.1M 
J Error set ±1 




X 


/•CTEQ6.1M 
J Error set ±2 




X 


^CTEQ6.1M 
J Central set 


b 2 


X 


^CTEQ6.1M 
J Central set 



where the first one gives the central prediction, and othergj are pairwize contributing 
to the size of the upper and lower errors via Eq. (fl3"l) . 

We stress that this way of computing the uncertainty for d+Au cross-section is a 
simplification due to parametrizing the nuclear modifications on top of fixed proton 
baseline PDFs. For example, when the CTEQ error sets are multiplied by our central- 
set, the momentum is not strictly conserved. In other words, although the central 
prediction of our nPDFs is well-defined, the free and bound proton PDF analyses 
cannot be completely separated in the uncertainty analysis. The only way of obtaining 
completely self-consistent uncertainty predictions for d+Au collisions would require 
putting the free and bound proton PDF analyses together into a one single analysis in 
which the artificial separation to free proton uncertainties and those stemming from the 
nuclear modifications would not be required. In the absence of such master analysis, 
we recommend to calculate the uncertainties for absolute cross-sections as explained 
above. 

The results and comparison with the BRAHMS data are shown in Fig. [13], in 
which the blue bands indicate the PDF uncertainty range. The lower panels show the 

6 The uncertainty sets in CTEQ6.1M and EPS09 correspond to similar 90%-confidence criterion. 
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data/theory ratios. Looking first at the 77 = 2.2 panels, the flatness of the data/theory 
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Figure 13: Inclusive yield of negatively charged hadrons h~ in pp and dAu collisions. The 
experimental data shown by open squares is from [37] with statistical and systematical errors 
added in quadrature, and the horizontal bars indicate the pr bin. The blue band indicates 
the 90% confidence range derived from free proton and nPDF uncertainties. The calculated 
cross-sections have been averaged over the pr-bin width. 



ratio shows that the shapes of the measured distributions for both p+p and d+Au cases 
are well reproduced by the pQCD calculation. As these ratios are also very close to 
one, we conclude that the normalization factors cx^ astlc and (N co \\) are well estimated 
and that fixing the factorization scale to hadronic pt is also a valid choice. 

In the 77 = 3.2 panels, one first observes that in the case of p+p collision the 
agreement between the theory and the data is clearly worse and the experimental and 
PDF errors are only barely overlapping. Such mismatch is not, however, observed in 
the d+Au cross-sections. Thus, we conclude that if the nuclear modification 

1 d 2 N AA «/dp T dy min=bias ±d 2 a dA »/dp T dy 
dAu ~ (iVcoii) d 2 NPP/dp T dy d 2 aPP/dp T dy ' 1 ' 

is formed from these data, the result with r\ = 3.2 will inevitably lie below the pQCD 
prediction. This is demonstrated in Fig. [TH where we show the comparison of the 
BRAHMS i?dAu data with the pQCD prediction from our global analysis. We stress 
that this happens not because we would incorrectly reproduce the d+Au cross-section 
but because the NLO pQCD calculation seems to undershoot the p+p data. This 
is the reason why we do not include these data to the current global nPDF analysis. 
However, as Fig. [T3] shows, there is no reason why one should abandon the factorization 
for h~ in d+Au collisions. Presumably, more statistics would be needed to resolve the 
discrepancy at r\ = 3.2. 
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Figure 14: The computed nuclear modification ratio RdAu at forward rapidities (filled 
squares) for negatively-charged hadron production, compared with the BRAHMS data [37] 
(open squares). The error bars are the statistical and systematic uncertainties added in 
quadrature. The additional overall normalization uncertainty of 5% is not shown. The blue 
bands are computed using the EPS09NLD sets. 

5 Summary and Outlook 

We have presented a global analysis of the nuclear corrections to the free proton PDFs 
at NLO accuracy in pQCD. Since the quality of the obtained fit is excellent and since 
there seems not be any strong tensions between the fitted data set^j], we argue that 
our analysis supports the validity of collinear factorization in describing hard nuclear 
collisions such as DIS, DY and inclusive high-p^ pion production. The qualities of both 
NLO and LO fits are almost identical, x 1 /N pa 0.79, but the NLO analysis somewhat 
reduces the inherent uncertainties. This is contrast to the free proton analyses where 
the quality of the NLO fit is significantly better (see e.g. [51]). In the near future, 
with new data for inclusive hadron and direct photon production from RHIC and LHC 
experiments, factorization will be tested further. 

For the first time in the nuclear case, we have now also quantified the uncertainties 
originating from the experimental errors in a manner that we can release a collection 
of nPDF error sets. Using these 1+30 sets, the nPDF uncertainties can be propagated 
to any hard process of interest. In lieu of a still missing master analysis which would 
consistently combine both the free and bound proton PDFs, we recommend to estimate 
the uncertainties as follows: If an absolute cross-section a is considered the PDF and 
nPDF uncertainties should be combined independently, schematically as follows 

( Acr )Total = ( A(T )proton + ( A °")ePS09- 

For the nuclear cross-section ratios R, — like those in Eq. ([7]) — it should suffice to 
7 E.g. in EPS08 [4] there was a significant tension between the DIS and forward BRAHMS data. 
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consider only the nPDF errors, 



(A-R)Total - ( A - R )ePS09 

(see Sees. 12.51 and 0] for the details). In the latter case we do not recommend to 
include (Ai?)p roton , as this would lead to an overestimation of the errors. Moreover, 
for consistency EPS09 should be used in conjunction with the same free proton PDF 
set that was used in the analysis, namely CTEQ6.1M at NLO and CTEQ6L1 at LO. 

The most up-to-date free proton PDF analyses (e.g. [5H [52]) implements a some- 
what better organized method for treating the heavy quarks than the ZM-VFNS em- 
ployed in this analysis. Extension of the nPDF analysis to such general-mass scheme 
remains as a future work. Recently, the CTEQ collaboration has looked at the nu- 
clear PDFs from a slightly different point of view by fitting the cross-section data from 
neutrino- Iron scattering [53]. Interestingly, the central fit points to a somewhat dif- 
ferent nuclear modifications than what are obtained from electromagnetic DIS. Before 
making any strong conclusions about the possible difference, these data should be in- 
cluded in a global analysis and the possible tension between these and the other data 
sets carefully investigated. Also this is left as a future task. 
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Appendix 



A Error analysis 

In this appendix we explain the error analysis in more detail. The basic formalism we 
present here was introduced in the original CTEQ articles [10] , but we have taken 
guidelines also subsequent CTEQ and MRST papers e.g. [T3l 133] . 



A.l Hessian Matrix 

The Hessian approach is based on a quadratic expansion of x 2 — X 2 {{ a }) near its 
minimum Xo = X 2 { a °}i 



x 2 



where 5cii = a, — a°, and is the Hessian matrix defined here by 

1 d* x 2 



lj 2 ddiddj 



(17) 



Being symmetric, the Hessian matrix has a set of normalized eigenvectors and 
eigenvalues such that 

J2H^=e k v?\ Y.^f=^- (18) 
Introducing a new set of coordinates {z}, as 

<5a 4 = ^^'\/-^-, (19) 

j V £ j 

one finds that 

£fc = V^y^^ <faj, (20) 
i 

and 

X 2 ^Xo + $>- ( 21 ) 

i 

which implies that in the region where the quadratic approximation (IT61) is valid, the 
new coordinates {z} are independent — the x 2 increases uniformly in an arbitrary 
direction in the z-space. 
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A. 2 Error propagation 

Let us consider any quantity X that depends on the PDFs, that is X = X({z}). 
In a linear approximation, X may be expanded in the vicinity of its central value 
X = X({z = 0}) as X = X + AX, where 

- E © ^ < 22 > 

Since x 2 grows uniformly in all z-space directions, the z-space vector that extremizes 
AX for a given A% 2 , is in the direction of the gradient of X and has a length of a/ A% 2 . 
The components of this vector are 

-1/2 

<$* = y/Ax 2 ( ^ ) I > ; ( ^ ) , (23) 




giving 

\2 



'ax x 5 



(AX)L rcmum - A X 2 £ ) . (24) 



j 

In order to facilitate the computation of the derivatives dX/dzk in Eq. (T2"4"|) . we define 
auxiliary PDFs in several z-space coordinates: 



S 


= (0,0,0,. ..,0) 




sf 


= ±VAx 2 (1,0,0, 


...,o) 


si 


= ±^A X 2 (0,1,0, 


...,o) 


'n-i 


= ±v/A x 2 (0,0,... 


,1,0) 


q± 

N 


= ±VAx 2 (0,0,... 


,0,1) 



(25) 



The first set So is the central set giving the minimum x 2 , and in the other ones the fit 
parameters are changed by a fixed amount in each z-space direction separately. Using 
these PDF sets, the derivatives dXj dz^ can be approximated by a finite difference 



dX ^ X(S£) - X(So) „ X(S+) - Xffi 



dz k ±v/AP 2^/AP 



(26) 



where X(Sj^) denotes the value of X computed with the PDF set S^. Upon insertion 
to Eq. (J23D we get 
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Figure 15: This figure illustrates how the A% 2 in our NLO analysis grows when the fit 
parameters are scanned along the eigenvector directions. The shaded light blue band indicates 
the ideal quadratic behaviour Ax 2 = zf. We can see that the quadratic approximation is 
good but not perfect. 

These are the simplest equations by which the PDF uncertainties are propagated to 
any quantity X. 

However, often the quadratic approximation is not perfect, revealed by non-ideal 
behaviour of x 2 m the z-space. From Fig. [T5] such an behaviour can be concretely seen 
to some extent happen also in our NLO analysis. In such a case the growth of x 2 induced 
by a/ Ax 2 deviations in the eigenvector directions can be significantly different from 
the ideal A% 2 . Moreover, the eigendirections may exhibit very asymmetric behaviour, 
and the required deviations for x 2 to grow by Ax 2 may be significantly different on 
either side of the origin. This observation necessitates to modify the definition of the 
auxiliary PDFs to 

St = ±fe±(l,0,0,...,0) 

Sf = ±6z± {0,1,0,..., 0) (28) 

where 8zf and 5z^ are the deviations in positive and negative side of zth eigendirection 
respectively, that cause x 2 to grow by Ax 2 . Replacing the Eq. (12?]) by the following 
prescription [U] 

( A ^ + ) 2 « ^[max{X(5 fc + )-X(5°),X(^7)-X(^),0}] 2 (29) 

k 

(AX~) 2 « ^[max{X(5 )-X(^),X(5 )-X(5 fc "),0}] 2 , 

k 

serves as a way to obtain the upper and lower uncertainties separately. If the quadratic 
approximation for x 2 and the linear approximation for X were both perfect the Eq. (j27p 
and Eq. ff29l) would lead to identical results. 
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A.3 Choice of A X 2 ? 



In the derivation above, we assumed that there is a fixed A% 2 which sets a x 2 boundary 
beyond which the global fit quickly deteriorates. In a statistically ideal case, when the 
uncertainties of the data are Gaussian-type and the theory is known to be "right 
enough", the one standard deviation corresponds to A% 2 = 1 |49j . Such requirements 
can usually be met only if data from a very limited amount of sources are accepted - 
like PDF fits including only HERA data |43j . where the ideal choice A% 2 = 1 is indeed 
justified. 

However, in a truly global PDF analysis which, by definition, aims to combine data 
from as many independent experiments as possible, the situation becomes quite far 
from being ideal in the statistical sense: There are often sizable, unexplainable fluctu- 
ations within individual data sets and in practice all data sets are not mutually fully 
compatible. Thus, taking A% 2 = 1 in such analysis becomes practically meaningless. 

This is most obviously true also in the present analysis — combining all data from 
the lightest nuclei up to the heaviest ones — which poses an additional difficulty which 
is not met in the free-proton analysis: In lieu of large amount of data for each nucleus 
separately, we model the A-dependence of the fit parameters by a power law (j5J). In 
this way we can evidently catch the correct general trend, but there will inevitably be 
some small pull between individual nuclei in various places. Thus, the choice for A% 2 
should account also for our inability to treat the individual nuclei separately. 

Below, we first introduce somewhat extended statistical approach which is some- 
times used in determining A% 2 , and continue with a physically better motivated one, 
which we use in our error analysis. 



Statistically motivated method: 



Let us assume that the probability distribution for the global x 2 -function is a sim- 
ple Gaussian in the N- dimensional z-space 



Pd(M,N) = -^e- 2 / 2 . (30) 



1 

The probability P(Ax 2 , N) for X 2 increasing at most A% 2 is then obtained from 



r ( N \ /-Ax 2 

P(A X 2 ,N) = / [T\dzA P D (\z\,N)= dSP*\N,S) (31) 

</z2<Ax 2 / J ° 

, / ri \ N/2-1 

PS\N, S) = ^(f) e^. ,32) 

For usual "one standard deviation" P rs 68.3% and for N = 15 parameters, this ap- 
proach would suggest A% 2 w 17. This a way of choosing A% 2 has been routinely 
employed e.g. by Kumano et al. [S] and previously also by us [3]. 
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Physically motivated method: 



An alternative way for finding Ax 2 is to examine how the ^-contributions xt °f m " 
dividual data sets with Nk data points, behave in the z-space (see e.g. [13]). Loosely 
speaking, by determining acceptable intervals for each of the z-space directions, and 
taking a suitable average, we can arrive at a physically better motivated value for Ax 2 
than what is obtained from the idealized statistical way described above. 

Restricting the range how much individual ^-contributions x\ are allowed to grow 
above the central set value Xok> I s complicated by fluctuations in the data which can 
make Xok/Nk ^ 1) even if the overall agreement was visually good. In order to 
compensate for such effects and to treat all data sets equally, we follow the procedure 
suggested by the CTEQ and define normalized variables x\ n 

2 2 _ 2 | | /oo\ 

Xk Xk,n = Xk 2 • ( 33 ) 

Assuming that each x\ n follows the probability distribution of Eq. fl32l) with N = N k , 
the most probable value of xtn I s attained by the central set Sq. For each experiment 
we denote the largest acceptable xt n °y M(P, xt) and define it as a solution to equation 

/ dSP*(N k ,S) = P, (34) 

Jo 

for fixed confidence level P. Sliding each parameter z% across the parameter space and 
simultaneously monitoring the size of xtm we determine the range 

that fulfills the criterion 

XL < M(xt). (36) 



In Fig. [161 we display the P = 0.90 limits for eigendirection z 10 obtained by this 
procedure. The intersection flfcfz-^, zf^] gives us lower and upper bounds z~ and z^ , 
for which none of the individual ^-contributions exceed their 90% confidence limits. 
This intersection is the shaded band in the Fig. [161 

Finally, denoting the increase of x 2 m t ne boundaries by Ax 2 {zf) 1 the following 
average is taken as an overall estimate for A% 2 : 



, v A x 2 (zt) + A x 2 (z-) ^ v {zjf + 
^ 2N ~ 4^ 2N 



(37) 



where the latter relation would be exact in an ideal case with perfectly quadratic x 2 - 
Performing such a procedure we find Ax 2 ~ 50 — three times the value that would be 
indicated by the idealized statistical method. 
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Figure 16: The 90% confidence limits for different data sets in the eigendirection direction 
zio- The solid boxes indicate the locations of the minima for each No line is shown if the 
limits exceed the shown range. The global minimum is at z = 0. 



We note that it is also possible to take 8zf = zf in Eq. This choice is employed 
e.g. in a recent MSTW free proton analysis [51]. The benefit is that one circumvents 
the need for a single, "master" A% 2 — each direction Zj is treated separately. We 
have tested also this approach and, in comparison to the method described above, 
observed somewhat larger uncertainty predictions in those x-regions, where there are 
no stringent data constraints. This is caused by larger parameter intervals allowed 
in those directions which do not quickly meet the 90% confidence criterion, although 
the reproduction of several data sets gets worse, indicated by the approximate growth 
Ay 2 pd (5zi) 2 . In other words, A% 2 for the individual error sets would exhibit large 
differences among themselves. For this reason, we choose not to follow this precription, 
but construct the error sets in Eq. (1251) such that correspond to a fixed increase of 
Ax 2 = 50. 



A. 4 Calculation of the Hessian Matrix 

A practical problem in applying the Hessian method is how to obtain a reliable approx- 
imation for the Hessian matrix. Ideally, if the x 2 -function was smooth at the minimum 
and the A% 2 very small, we would obtain a useful Hessian matrix as a side product 
of the MINUIT minimization. In reality, our x 2 -function is a complicated functional of 
the fit parameters and due to the finite precision of the DGLAP solver and multiple 
numerical integrations, it generally becomes non-continuous [55] in small parameter 
intervals. Such numerical noise makes it impossible to obtain a reliable Hessian matrix 
from a general purpose fitting program like MINUIT. 

In order to obtain an estimate for the Hessian matrix, we adopt the following 
algorithm: 
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(a) Uncorrelated (b) Correlated 



Figure 17: Two illustrative examples how our x 2 - iunc ti° n may look in a two-parameter 
plane. The individual blue points are the exact x 2 values and the shaded surface results 
from the fit of Eq. (|38l) . In the figure (a), the two parameters are fairly uncorrelated and the 
X 2 increases almost uniformly in all directions. In the other case (b) there is a significant 
correlation between the two parameters revealed by the shallow "valley" going from a corner 
to another. 

1. Determine the parameter step sizes ±5ai from their central values a° that give 
an approximately fixed increase in \ 2 '■ 

2. Compute the x 2 values for each parameter pair in a two dimensional grid spanning 
the range (a*, a,j) G [a° — 5di, a° + 8ai\ x [a° — 5a j, a° + 5a j]. 

3. Fit the coefficients with quadratic function of the form 

f(a { , aj) = xl + cMi - a-) 2 + Cj{a.j - a°) 2 + ^(a, - af){aj - a°) (38) 

to the computed x 2_va l ues a t the grid points, and read off the Hessian matrix 
entries: Hij = Cy/2. The diagonal elements Ha are obtained by a similar fit but 
in one dimension. 

In Fig. [T7] we show two examples of how the x 2 typically looks as a function of two 
parameters. The shaded surface shows what is obtained from the fitted function of 
Eq. (I38p . From figures like this, we have verified how well the quadratic function (138]) 
fits the neighborhood of the Xo weu - These figures clearly reveal the two-parameter 
correlations too. 

Obtaining a trustworthy approximation for the Hessian matrix from the above 
fitting method, requires knowledge of the desired Ax 2 beforehand for adjusting the 
parameter intervals to be physically relevant: If too small or large step sizes are used 
to construct the Hessian matrix, the resulting quadratic approximation (|T6|) may be bad 
for the Ax 2 which will eventually be required by Eq. ([3~T]) . Therefore also this procedure 
is necessary somewhat iterative: To begin with, the Hessian matrix is computed by 
fixing the parameter step sizes to cause a fixed increase in % 2 , say 25. Following 
the procedure outlined in earlier Sec. IA.31 an appropriate Ax 2 is then determined. 
Inspecting the behaviour of x 2 as a function of the deviations in each eigendirections, 
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as shown in Fig. [15j one ma y judge whether the achieved quadratic approximation 
is valid or not near the obtained A% 2 . If needed, one may then try to adjust the 
parameter intervals in the computation of the Hessian accordingly, in order to improve 
the quadratic approximation. 



B How to speed up computation of pion production 

In this appendix we sketch the method to speed up the numerical calculation of inclusive 
pion production cross-section in d + Au collisions 

d(A- 1 ) + Au(K 2 )^7r(K 3 )+^, 

where K\ and K 2 denote the incoming momenta, and A3 the momentum of the observed 
pion. The cross-section can be written as a convolution of the PDFs /, the fragmen- 
tation function Di^ w for parton / to make a pion, and the partonic cross-section a 
[36] 

B3 <Md + Au_px) = zf*iJ*,f^f}i*i,A*)fft**A t ) (39) 



ijl 



n u ,,2 \„o d *(pi+p£-»P3>/4J 



d 3 p 3 



pi = X1K1 
p 2 = x 2 K 2 
p 3 = K 3 /x 3 



Here one faces a practical problem of significantly extended computing time consumed 
in evaluating the triple integrals, apparently making the fitting procedure — involving 
tens of thousands of calls — very slow. However, if the factorization scale /i 2 act is chosen 
not to depend on the partonic momenta, e.g. fixing all scales to a common scale /x 2 
proportional to the transverse momentum of the pion, one can define a set of weighting 
functions 

F^^) =j:jdx 1 J *^ti(x ut i*)D^ h (x a , p f m + ^ Pl ^ , (40) 

which can be evaluated and tabulated in advance as Fj(x 2 ) do not depend on the 
nuclear PDFs. After such preparations, the Eq. ( 1391) can be evaluated in advance up a 
to single integral 

^ J dx 2 F 3 {x 2 ^ 2 )f^{x 2 ^ 2 ). (41) 



E 



d?K 3 ^ 



.1 



With the above trick we overcome the problem of slowness and make the inclusion of 
hadron production data feasible also in NLO. We should also note that this method 
is possible as we take the free proton (and deuterium) PDFs as fully known, i.e these 
are not subject to iteration here. If this were not the case, only one integral could be 
predone in Eq. f[39|) . 
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